Abstract. We propose a definition of QNC, the quantum analog of the
efficient parallel class NC. We exhibit several useful gadgets and prove
that various classes of circuits can be parallelized to logarithmic depth,
including circuits for encoding and decoding standard quantum
error-correcting codes, or more generally any circuit consisting of
controlled-not gates, controlled π-shifts, and Hadamard gates. Finally, while
we note that the Quantum Fourier Transform can be parallelized to linear
depth, we conjecture that an even simpler 'staircase' circuit cannot be
parallelized to less than linear depth, and might be used to prove that
QNC < QP.



1 Introduction 

Much of computational complexity theory has focused on the question of what
problems can be solved in polynomial time. Shor's quantum factoring algorithm
[14] suggests that quantum computers might be more powerful than classical
computers in this regard, i.e. that BQP might be a larger class than P, or rather
BPP, the class of problems solvable in polynomial time by a classical probabilistic
Turing machine with bounded error.

A finer distinction can be made between P and the class NC of efficient
parallel computation, namely the subset of P of problems which can be solved by
a parallel computer with a polynomial number of processors in polylogarithmic
time, $O(\log^k n)$ time for some k, where n is the number of bits of the input
[12]. Equivalently, NC problems are those solvable by Boolean circuits with a
polynomial number of gates and polylogarithmic depth.

This distinction seems especially relevant for quantum computers, where
decoherence makes it difficult to do more than a limited number of computation
steps reliably. Since decoherence due to storage errors is essentially a function
of time, we can avoid it by doing as many of our quantum operations at once as
possible; if we can parallelize our computation to logarithmic depth, we can solve
exponentially larger problems. (Gate errors, on the other hand, will typically get
worse, since parallel algorithms often involve more gates.)

In this paper, we propose a definition of QNC and prove a number of
elementary results. Our main theorem is that circuits consisting of
controlled-not gates, controlled π-shifts, and Hadamard gates can be
parallelized to logarithmic depth. This includes circuits for encoding and
decoding standard quantum error-correcting codes. We end with a conjecture
that a simple 'staircase' circuit cannot be parallelized, and so might be used
to prove that QNC < QP.

2 Definitions 

We define quantum operators and quantum circuits as follows: 

Definition 1. A quantum operator on n qubits is a unitary rank-2n tensor U
where $U^{b_1 b_2 \cdots b_n}_{a_1 a_2 \cdots a_n}$ is the amplitude of the
incoming and outgoing truth values being $a_1, a_2, \ldots, a_n$ and
$b_1, b_2, \ldots, b_n$ respectively, with $a_i, b_i \in \{0, 1\}$ for all i.
However, we will usually write U as a $2^n \times 2^n$ unitary matrix $U_{ab}$
where a and b's binary representations are $a_1 a_2 \cdots a_n$ and
$b_1 b_2 \cdots b_n$ respectively.

A one-layer circuit consists of the tensor product of one- and two-qubit gates, 
i.e. rank 2 and 4 tensors, or 2x2 and 4x4 unitary matrices. This is an operator 
that can be carried out by a set of simultaneous one-qubit and two-qubit gates, 
where each qubit interacts with at most one gate. 

A quantum circuit of depth k is a quantum operator written as the product 
of k one-layer circuits. 

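Here we are allowing arbitrary two-qubit gates. If we like, we can restrict
these to controlled-U gates of the form

$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & u_{11} & u_{12} \\ 0 & 0 & u_{21} & u_{22} \end{pmatrix},$$

or more stringently to the controlled-not gate

$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}.$$

For these, we will call the first and second qubits the input and target
qubit respectively, even though they don't really leave the input qubit
unchanged, since they entangle it with the target.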

Since either of these can be combined with one-qubit gates to simulate
arbitrary two-qubit gates [?], these restrictions would just multiply our
definition of depth by a constant. The same is true if we wish to allow gates
that couple k > 2 qubits as long as k is fixed, since any k-qubit gate can be
simulated by some constant number of two-qubit gates.

In order to design a shallow parallel circuit for a given quantum operator,
we want to be able to use additional qubits or "ancillae" for intermediate steps
in the computation, equivalent to additional processors in a parallel quantum
computer. However, to avoid entanglement, we demand that our ancillae start
and end in a pure state |0⟩, so that the desired operator appears as the diagonal
block of the operator performed by the circuit on the subspace where the ancillae
are zero.

Then in analogy with NC we propose the following definition: 

Definition 2. Let F be a family of quantum operators, i.e. F(n) is a
$2^n \times 2^n$ unitary matrix on n qubits. We say that F(n) is embedded in an
operator M with m ancillae if M is a $2^{m+n} \times 2^{m+n}$ matrix which
preserves the subspace where the ancillae are set to |0⟩, and if M is identical
to $F(n) \otimes 1_{2^m}$ when restricted to this subspace.

Then $\mathrm{QNC} = \bigcup_k \mathrm{QNC}^k$, where $\mathrm{QNC}^k$ is the
class of operators parallelizable to $O(\log^k n)$ depth with a polynomial
number of ancillae. That is, F is in $\mathrm{QNC}^k$ if, for some constants
$c_1$, $c_2$ and $j$, F(n) can be embedded in a circuit of depth at most
$c_1 \log^k n$, with at most $c_2 n^j$ ancillae.

To extend this definition from quantum operators to decision problems in the 
classical sense, we would have to choose a measurement protocol, and to what 
extent we want errors to be bounded. We will not explore those issues here.

Fig. 1. Our notation for controlled-not, controlled-U, symmetric phase shift, and
arbitrary diagonal gates.

We will use the notation in figure 1 for our various gates: the controlled-not
gates

A preliminary version of this work, lacking proposition ?? and all of section 7
on quantum codes, appeared as [?].



3 Permutations 



In classical circuits, one can move wires around as much as one likes. In a quan- 
tum computer, it may be more difficult to move a qubit from place to place. 
However, we can easily do arbitrary permutations in constant depth: 

Proposition 1. Any permutation of n qubits can be performed in 4 layers of 
controlled-not gates with n ancillae, or in 6 layers with no ancillae. 



Proof. The first part is obvious; simply copy the qubits into the ancillae, cancel
the originals, recopy them from the ancillae in the desired order, and cancel the
ancillae. This is shown in figure 2.

Without ancillae, we can use the fact that any permutation can be written
as the composition of two sets of disjoint transpositions [?]. To see this, first
decompose it into a product of disjoint cycles, and then note that a cycle is the
composition of two reflections, as shown in figure 3. Two qubits can be switched
with 3 layers of controlled-not gates as shown in figure 4, so any permutation
can be done in 6 layers. □

Fig. 2. Permuting n qubits in 4 layers using n ancillae.

Fig. 3. Any cycle, and therefore any permutation, is the composition of two sets
of disjoint transpositions.

Fig. 4. Switching two qubits with three controlled-nots.
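
The argument in the proof of proposition 1 can be checked on classical bit
strings. The following is a minimal sketch, not the construction of the
figures: the function names `two_involutions` and `swap_with_three_cnots` are
our own, and a permutation is assumed to be given as a list `perm` with
`perm[i]` the image of `i`. Each cycle is split into the reflections
i -> -i and i -> 1 - i of its positions, and a transposition is carried out by
three exclusive-ors.

```python
def two_involutions(perm):
    """Return involutions a, b (as lists) with perm[i] == b[a[i]] for all i."""
    n = len(perm)
    a = list(range(n))
    b = list(range(n))
    seen = [False] * n
    for start in range(n):
        if seen[start]:
            continue
        # Walk the cycle start -> perm[start] -> ... back to start.
        cycle = []
        x = start
        while not seen[x]:
            seen[x] = True
            cycle.append(x)
            x = perm[x]
        length = len(cycle)
        for i, c in enumerate(cycle):
            a[c] = cycle[(-i) % length]      # reflection i -> -i
            b[c] = cycle[(1 - i) % length]   # reflection i -> 1 - i
    return a, b


def swap_with_three_cnots(x, y):
    """Swap two classical bits with three controlled-nots (exclusive-ors)."""
    y ^= x  # x is the input, y the target
    x ^= y  # y is the input, x the target
    y ^= x  # x is the input, y the target
    return x, y


perm = [2, 0, 1, 4, 3, 5]
a, b = two_involutions(perm)
```

Since `a` and `b` are involutions, each is a set of disjoint transpositions,
and each transposition costs three layers, which gives the six layers of the
proposition.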



4 Fan-out 

To make a shallow parallel circuit, it is often important to fan out one of the
inputs into multiple copies. The controlled-not gate can be used to copy a qubit
onto an ancilla in the pure state |0⟩ by making a non-destructive measurement:

$$(\alpha|0\rangle + \beta|1\rangle) \otimes |0\rangle \to \alpha|00\rangle + \beta|11\rangle$$

Note that the final state is not a tensor product of two independent qubits, since
the two qubits are completely entangled. Making an unentangled copy requires
non-unitary, and in fact non-linear, processes since

$$(\alpha|0\rangle + \beta|1\rangle) \otimes (\alpha|0\rangle + \beta|1\rangle) = \alpha^2|00\rangle + \alpha\beta(|01\rangle + |10\rangle) + \beta^2|11\rangle$$

has coefficients quadratic in α and β. This is the classic 'no cloning' theorem.

This means that disentangling or uncopying the ancillae by the end of the
computation, and returning them to their initial state |0⟩, is a non-trivial and
important part of a quantum circuit. There are, however, some special cases
where this can be done easily.

Suppose we have a series of n controlled-U gates all with the same input
qubit. Rather than applying them in series, we can fan out the input into n
copies by splitting it $\log_2 n$ times, apply them to the target qubits, and uncopy
them afterward, thus reducing the circuit's depth to $O(\log n)$.

Proposition 2. A series of n controlled gates coupling the same input to n
target qubits can be parallelized to $O(\log n)$ depth with $O(n)$ ancillae.

Proof. The circuit in figure 5 copies the input onto n − 1 ancillae, applies all the
controlled gates simultaneously, and uncopies the ancillae back to their original
state. Its total depth is $2 \log_2 n + 1$. □
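
The doubling schedule behind proposition 2 can be sketched as follows. This
is an illustration under our own assumptions rather than the circuit of the
figure: qubit 0 is taken to be the input, qubits 1 to n − 1 the ancillae, and
the hypothetical helper `fan_out_layers` returns one list of (input, target)
controlled-not pairs per layer. Only basis states are simulated; by linearity
a superposition α|0⟩ + β|1⟩ becomes α|00...0⟩ + β|11...1⟩.

```python
def fan_out_layers(n):
    """Layers of controlled-nots copying qubit 0 onto qubits 1..n-1.

    Every layer doubles the number of copies, so about log2(n) layers suffice.
    """
    layers = []
    have = 1  # qubits 0..have-1 already hold a copy
    while have < n:
        new = min(have, n - have)
        # Inputs 0..new-1 and targets have..have+new-1 are disjoint.
        layers.append([(src, have + src) for src in range(new)])
        have += new
    return layers


def copy_basis_state(bit, n):
    """Run the fan-out on the basis state |bit, 0, ..., 0>."""
    qubits = [bit] + [0] * (n - 1)
    for layer in fan_out_layers(n):
        for control, target in layer:
            qubits[target] ^= qubits[control]
    return qubits


copy_depth = len(fan_out_layers(8))
total_depth = 2 * copy_depth + 1  # copy, one layer of controlled gates, uncopy
```

Running the same layers in reverse order uncopies the ancillae, which is
where the factor of two in the total depth comes from.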

This kind of symmetric circuit, in which we uncopy the ancillae to return
them to their original state, is similar to circuits designed by the Reversible
Computation Group at MIT [?] for reversible classical computers.

5 Diagonal and mutually commuting gates 

Fan-in seems more difficult in general. Classically, we can calculate the
composition of n operators in $O(\log n)$ time by composing them in pairs; but it is
unclear when we can do this with unitary operators. One special case where it
is possible is if all the gates are diagonal:

Fig. 5. Parallelizing n controlled gates on a single input qubit q to $O(\log n)$
depth.

Fig. 6. Using entanglement to parallelize diagonal operators.

Fig. 7. Parallelizing n diagonal gates on a single qubit as in proposition 2.



Proposition 3. A series of n diagonal gates on the same qubits can be
parallelized to $O(\log n)$ depth with $O(n)$ ancillae.

Proof. Here the entanglement between two copies of a qubit becomes an asset.
Since diagonal matrices don't mix Boolean states with each other, we can act on
one or more qubits and an entangled copy of them with two diagonal matrices
$D_1$ and $D_2$ as in figure 6. When we uncopy the ancilla(e), we have the same
effect as if we had applied both matrices to the original qubit(s). Then the same
kind of circuit as in proposition 2 works, as shown in figure 7. □
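
The key step of this proof can be checked numerically on one qubit and one
ancilla. The sketch below is ours and makes several assumptions: states are
stored as dictionaries from bit pairs (qubit, ancilla) to amplitudes, and the
amplitudes and the diagonal entries `d1`, `d2` are arbitrary illustrative
values.

```python
import cmath


def apply_cnot(state):
    """Controlled-not with qubit 0 as input and qubit 1 (the ancilla) as target."""
    return {(q, a ^ q): amp for (q, a), amp in state.items()}


def apply_diagonal(state, which, diag):
    """Multiply each basis state by diag[value of qubit number `which`]."""
    return {bits: amp * diag[bits[which]] for bits, amp in state.items()}


alpha, beta = 0.6, 0.8j
d1 = [1, cmath.exp(0.3j)]                 # diagonal gate D1
d2 = [cmath.exp(-0.5j), cmath.exp(1.1j)]  # diagonal gate D2

state = {(0, 0): alpha, (1, 0): beta}  # (alpha|0> + beta|1>) tensor |0>
state = apply_cnot(state)              # alpha|00> + beta|11>
state = apply_diagonal(state, 0, d1)   # D1 on the original qubit
state = apply_diagonal(state, 1, d2)   # D2 on the copy, in the same layer
state = apply_cnot(state)              # uncopy the ancilla

# Same result as applying D1 and then D2 to the original qubit alone.
expected = {(0, 0): alpha * d1[0] * d2[0], (1, 0): beta * d1[1] * d2[1]}
```

The ancilla ends in |0⟩ for both basis states, so it is disentangled from the
original qubit as the definition of embedding requires.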

Since matrices commute if and only if they can be simultaneously
diagonalized, we can generalize this to the case where a set of controlled-U
gates applied to the same target qubit(s) have mutually commuting U's:

Proposition 4. A series of n controlled-U gates acting on the same target
qubit(s) where the U's mutually commute can be parallelized to $O(\log n)$ depth
with $O(n)$ ancillae.

Proof. Since the U's all commute, they can all be diagonalized by the same
unitary operator T. Apply T to the target qubit(s), parallelize the circuit using
proposition 3, and put the target qubit(s) back in the original basis by applying
$T^\dagger$. This is all done with a circuit of depth $2 \log_2 n + 3$. □

Fig. 8. Applying an operator U q times, where q is given in binary by the input
qubits.



As an example, in figure 8 we show a circuit that applies the q-th power of
an operator U to a target qubit, where $0 \le q < 2^k$ is given by k input qubits
as a binary integer. We can do this because $U, U^2, U^4, \ldots$ can be simultaneously
diagonalized, since $U^q = T^\dagger D^q T$.
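
For completeness, the identity can be spelled out as follows. This is our own
expansion, assuming $U = T^\dagger D T$ with $T$ unitary and writing the
binary digits of $q$ as $q_0, \ldots, q_{k-1}$ (a labeling not used in the
text); the factors $T T^\dagger = 1$ cancel between neighboring terms.

```latex
U^q = \prod_{j=0}^{k-1} \left( U^{2^j} \right)^{q_j}
    = \prod_{j=0}^{k-1} T^\dagger D^{2^j q_j} T
    = T^\dagger D^{\sum_j 2^j q_j} T
    = T^\dagger D^q T ,
\qquad q = \sum_{j=0}^{k-1} q_j 2^j , \quad q_j \in \{0, 1\} .
```

Each factor is a controlled-$U^{2^j}$ gate with the j-th input qubit as its
input, which is the situation covered by proposition 4.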

We can extend this to circuits in general whose gates are mutually commut- 
ing, which includes diagonal gates: 

Proposition 5. A circuit of any size consisting of diagonal or mutually
commuting gates, each of which couples at most k qubits, can be parallelized to depth
$O(n^{k-1})$ with no ancillae, and to depth $O(\log n)$ with $O(n^k)$ ancillae. Therefore,
any family of such circuits is in $\mathrm{QNC}^1$.

Proof. Since all the gates commute, we can sort them by which qubits they
couple, and arrive at a compressed circuit with one gate for each k-tuple. This gives
$\binom{n}{k} = O(n^k)$ gates, but by performing groups of n/k disjoint gates
simultaneously we can do all of them in depth $O(n^{k-1})$.

By making $\binom{n-1}{k-1} = O(n^{k-1})$ copies of each qubit, we can apply each gate
to a disjoint set of copies as in propositions 2 and 3 to reduce this further to
$O(\log n)$ depth. □

This is hardly surprising; after all, diagonal gates commute with each other, 
which is almost like saying that they can be performed simultaneously. 

6 Circuits of controlled-not gates 

We can also fan in controlled-not gates. Figure 9 shows how to implement n
controlled-not gates on the same target qubit in depth $2 \log_2 n + 1$. The ancillae
carry the intermediate exclusive-ors of the inputs, and we combine them in pairs.

Fig. 9. Parallelizing n controlled-not gates to $O(\log n)$ depth by adding them in
pairs.



We can use a generalization of this circuit to show that any circuit composed 
entirely of controlled-not gates can be parallelized to logarithmic depth: 



Proposition 6. A circuit of any size on n qubits composed entirely of
controlled-not gates can be parallelized to $O(\log n)$ depth with $O(n^2)$ ancillae.
Therefore, any family of such circuits is in $\mathrm{QNC}^1$.

Proof. First, note that in any circuit of controlled-not gates, if the n input qubits
have binary values and are given by an n-dimensional vector q, then the output
can be written Mq where M is an n × n matrix over the integers mod 2. Each
of the output qubits can be written as a sum of up to n inputs,
$(Mq)_i = \sum_k q_{j_k}$, where the $j_k$ are those j for which $M_{ij} = 1$.

We can break these sums down into binary trees. Let $W_n$ be the complete
output sums, $W_{n/2}$ be their left and right halves consisting of up to n/2 inputs,
and so on down to single inputs. There are fewer than $n^2$ such intermediate sums
$W_k$ with k > 1. We assign an ancilla to each one, and build them up from the
inputs in $\log_2 n$ stages, adding pairs from $W_k$ to make $W_{2k}$. The first stage takes
$O(\log n)$ time and an additional $O(n^2)$ ancillae since we may need to make $O(n)$
copies of each input, but each stage after that can be done in depth 2.

Fig. 10. Parallelizing an arbitrary circuit of controlled-not gates to logarithmic
depth.



To cancel the ancillae, we use the same cascade in reverse order, adding pairs
from $W_k$ to cancel $W_{2k}$. This leaves us with the input q, the output Mq, and
the ancillae set to zero.

Now we use the fact that, since the circuit is unitary, M is invertible. Thus
we can recalculate the input $q = M^{-1}(Mq)$ and cancel it. We use the same
ancillae in reverse order, building the inputs q out of Mq with a series of partial
sums $V_2, V_4, \ldots$, cancel q, and cancel the ancillae in reverse as before. All this is
illustrated in figure 10.

This leaves us with the output Mq and all other qubits zero. With four more
layers as in proposition 1, we can shift the output back to the input qubits, and
we're done. □
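
The two facts this proof relies on, that a circuit of controlled-nots acts as
a matrix M over the integers mod 2 and that M is invertible, can be checked
directly on small cases. The sketch below is ours: the helper names and the
example gate list are hypothetical, and gates are assumed to be given in order
of application as (input, target) pairs.

```python
from itertools import product


def cnot_matrix(n, gates):
    """Rows of M over GF(2): row i lists the inputs whose sum is output i."""
    rows = [[int(i == j) for j in range(n)] for i in range(n)]
    for control, target in gates:
        rows[target] = [a ^ b for a, b in zip(rows[target], rows[control])]
    return rows


def run_circuit(q, gates):
    """Apply the controlled-nots one after another to the bit vector q."""
    q = list(q)
    for control, target in gates:
        q[target] ^= q[control]
    return q


def apply_matrix(M, q):
    """Compute Mq over the integers mod 2."""
    return [sum(m & x for m, x in zip(row, q)) % 2 for row in M]


def is_invertible(M):
    """Gaussian elimination over GF(2)."""
    M = [row[:] for row in M]
    n = len(M)
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col]), None)
        if pivot is None:
            return False
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col and M[r][col]:
                M[r] = [a ^ b for a, b in zip(M[r], M[col])]
    return True


n = 4
gates = [(0, 1), (1, 2), (2, 0), (0, 3)]
M = cnot_matrix(n, gates)
agrees = all(run_circuit(q, gates) == apply_matrix(M, list(q))
             for q in product([0, 1], repeat=n))
```

Each row of `M` is exactly one of the output sums that the binary trees of
ancillae compute in the parallel circuit.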

This result is hardly surprising; after all, these circuits are reversible Boolean
circuits, and any classical circuit composed of controlled-not gates is in $\mathrm{NC}^1$ (in
fact, in the class $\mathrm{ACC}^0[2]$ of constant-depth circuits with sum mod 2 gates of
unbounded fan-in). We just did a little extra work to disentangle the ancillae.



7 Controlled-not gates and phase shifts 

We have shown that circuits composed of diagonal or controlled-not gates can be
parallelized. It's reasonable to ask whether propositions 5 and 6 can be combined;
that is, whether arbitrary circuits composed of controlled-not gates and diagonal
operators can be parallelized to logarithmic depth. In this section, we will show
that this is the case.

Proposition 7. Any diagonal unitary operator on n qubits can be performed by 
a circuit consisting of an exponential number of controlled-not gates and one- 
qubit diagonal gates and no ancillae. 

Proof. Any diagonal unitary operator on n qubits consists of $2^n$ phase shifts,

$$\mathrm{diag}\left( e^{i\omega_0}, e^{i\omega_1}, \ldots, e^{i\omega_{2^n - 1}} \right).$$

If we write the phase angles as a $2^n$-dimensional vector ω,
then the effect of composing two diagonal operators is simply to add these vectors
mod 2π.

For each subset s of the set of qubits, define a vector $\mu_s$ whose entry for
each Boolean state is +1 if the number of true qubits in s is even, and −1 if it
is odd. If s is all the qubits, for instance, $\mu_{\{1 \ldots n\}}$ is the aperiodic
Morse sequence (+1, −1, −1, +1, ...) when written out linearly, but it really just
means giving the odd and even nodes of the Boolean n-cube opposite signs.

It is easy to see that the $\mu_s$ for all $s \subseteq \{1, \ldots, n\}$ are linearly independent,
and form a basis of $\mathbb{R}^{2^n}$. Moreover, while diagonal gates coupling k qubits can
only perform phase shifts spanned by those $\mu_s$ with $|s| \le k$, the circuit in figure
11 can perform a phase shift proportional to $\mu_s$ for any s (incidentally, in depth
$O(\log |s|)$ with no ancillae). Therefore, a series of $2^n$ such circuits, one for each
subset of {1, ..., n}, can express any diagonal unitary operator. □
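
The linear independence claimed in this proof can be verified for small n by
checking that the vectors are mutually orthogonal. The following sketch is
ours: a subset s is encoded as a bitmask, basis states are numbered 0 to
2^n − 1 with bit i standing for qubit i, and the names `mu` and `inner` are
illustrative.

```python
def mu(s, n):
    """Vector mu_s over all 2**n basis states; s is a bitmask of qubits."""
    return [1 - 2 * (bin(x & s).count("1") % 2) for x in range(2 ** n)]


def inner(u, v):
    """Ordinary real inner product."""
    return sum(a * b for a, b in zip(u, v))


n = 3
vectors = [mu(s, n) for s in range(2 ** n)]
gram = [[inner(u, v) for v in vectors] for u in vectors]
```

The Gram matrix comes out as 2^n times the identity, so the 2^n vectors are
orthogonal and therefore a basis.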

This exponential bound is necessary in the worst case: 

Proposition 8. There are diagonal operators that cannot be parallelized to less 
than exponential depth with a polynomial number of ancillae. 



Fig. 11. A circuit for the phase shift $\theta\mu_s$, i.e. a phase shift of +θ if the
number of true qubits is even and −θ if it is odd.



Proof. Consider setting up a many-to-one correspondence between circuits and
operators. The set of diagonal unitary operators on n qubits has $2^n$ continuous
degrees of freedom, while the set of circuits of depth d with m ancillae has only
$O(d(m + n))$ continuous degrees of freedom (and some discrete ones for the
circuit's topology). Thus if m is polynomial, d must be exponential. □
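
Written as an inequality, the counting argument reads as follows. The
constant $c$ is our own notation for the number of continuous parameters
contributed per qubit per layer, which the text leaves implicit in the
$O(\cdot)$.

```latex
c \, d \, (m + n) \;\ge\; 2^n
\quad\Longrightarrow\quad
d \;\ge\; \frac{2^n}{c \, (m + n)} ,
```

so any polynomial bound on $m$ forces $d$ to grow exponentially in $n$.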

However, the next proposition shows that this won't help us distinguish QP 
from QNC. In fact, for controlled-nots and diagonal gates, QP and QNC are 
identical: 

Proposition 9. Any circuit consisting of controlled-not gates and m diagonal
operators coupling k qubits each can be parallelized to $O(\log n)$ depth with
$O(\max(kmn, n^2))$ ancillae. Therefore, any such circuit of polynomial size $O(n^c)$
can be parallelized to $O(\log n)$ depth with $O(k n^{c+1})$ ancillae, and any family of
such circuits with fixed k is in $\mathrm{QNC}^1$.

Proof. Any such circuit can be written as the product of a circuit of
controlled-not gates and a diagonal matrix that takes care of the phase shifts. The first
part we can parallelize as in proposition 6, to $O(\log n)$ depth and $O(n^2)$ ancillae.
As proposition 8 shows, diagonal matrices cannot be parallelized in general, so
we have to look at the circuit more closely.

We can write the circuit we are trying to parallelize as a product
$M = M_0 P_1 M_1 P_2 M_2 \cdots P_m M_m$ where the $M_i$ consist only of controlled-not
gates and the $P_i$ are the diagonal operators. By passing the $P_i$ to the right end
of the circuit, we can write

$$M = M_0 \cdots M_m \cdot D_1 \cdots D_m$$

where $D_i$ is the diagonal operator

$$D_i = (M_i \cdots M_m)^\dagger P_i (M_i \cdots M_m)$$

In other words, we simply calculate what state the controlled-not circuit was in
when $P_i$ was applied, apply it, and uncalculate.
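
The fact that conjugating a diagonal operator by a circuit of controlled-nots
gives another diagonal operator can be illustrated on a small example. This
sketch is ours and rests on stated assumptions: operators act on column
vectors of amplitudes, bit i of a basis-state index stands for qubit i, and
the gate list and phases are arbitrary. Writing the controlled-not circuit as
a permutation `pi` of basis states, the operator $D = M^\dagger P M$ has
entries `D[x] = P[pi[x]]`.

```python
import cmath


def cnot_permutation(n, gates):
    """pi[x] is the basis state that the controlled-not circuit sends x to."""
    pi = []
    for x in range(2 ** n):
        y = x
        for control, target in gates:
            if (y >> control) & 1:
                y ^= 1 << target
        pi.append(y)
    return pi


n = 3
gates = [(0, 1), (1, 2), (2, 0)]
pi = cnot_permutation(n, gates)
phase = [cmath.exp(0.37j * x * x) for x in range(2 ** n)]  # arbitrary diagonal P

# D = M^dagger P M is diagonal, with the phases of P permuted by pi.
d = [phase[pi[x]] for x in range(2 ** n)]


def apply_M_then_P(amps):
    """Act with the operator product P M on a vector of amplitudes."""
    out = [0j] * len(amps)
    for x, amp in enumerate(amps):
        out[pi[x]] += phase[pi[x]] * amp
    return out


def apply_D_then_M(amps):
    """Act with the operator product M D on a vector of amplitudes."""
    out = [0j] * len(amps)
    for x, amp in enumerate(amps):
        out[pi[x]] += d[x] * amp
    return out


amps = [complex(x + 1, -x) for x in range(2 ** n)]
lhs = apply_M_then_P(amps)
rhs = apply_D_then_M(amps)
```

Because every `d[x]` is a phase of unit modulus, D is again a diagonal unitary,
and the D's obtained this way all commute with each other.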



Each one of the k qubits coupled by $P_i$ is the exclusive-or of some subset
of the inputs, and can be calculated with a binary tree of $O(n)$ ancillae as in
proposition 6. Finally, by proposition 3 we can apply all the $D_i$ at once, by
making m copies of the system's entire state. Thus the total number of ancillae
needed is $O(kmn)$, or $O(k n^{c+1})$ if $m = O(n^c)$. □



8 The Hadamard gate, the Clifford group, and quantum 
codes 

So far, all the circuits we have looked at are essentially classical; each row and 
each column has only one non-zero entry, so they are just reversible Boolean 
functions with phase shifts. Obviously, any interesting quantum algorithm will 
involve mixing between different Boolean states. 

The simplest such operator is the Hadamard gate
$R = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$. By applying
it to all n qubits of a state |000⋯0⟩, we can prepare them in a superposition of all
$2^n$ possible states. It is also the basic ingredient, along with phase shifts, of the
standard circuit (shown below in figure 18) for the Quantum Fourier Transform.



We will call a controlled-U gate a controlled-Pauli gate if U is one of the Pauli
matrices $\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$,
$-i\sigma_y = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$, or
$\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$. Note that a controlled-X
is simply a controlled-not, a controlled-Z is just the symmetric π-shift, and this
real version of the controlled-Y is their product.




Fig. 12. Relations between the π-shift, the controlled-not, and the w gate, which
we notate with a wiggle.



The π-shift can be written in terms of a controlled-not by conjugating the
target with R. Conjugating the input qubit instead gives us the π-shift in the
Hadamard basis, which is a symmetric gate

$$\frac{1}{2} \begin{pmatrix} 1 & 1 & 1 & -1 \\ 1 & 1 & -1 & 1 \\ 1 & -1 & 1 & 1 \\ -1 & 1 & 1 & 1 \end{pmatrix}.$$

We call this the w-gate, and notate it as in figure 12.
Then we have the following:



Proposition 10. Circuits of any size consisting of controlled-Pauli gates and
the Hadamard gate R can be parallelized to $O(\log n)$ depth with $O(n^2)$ ancillae.
Thus any family of such circuits is in $\mathrm{QNC}^1$.




Fig. 13. Step 1: combing R's to the right through controlled-nots, π-shifts, and
w gates.

Fig. 14. Step 2: commuting π-shifts and w's past each other, and combining
them into 1, z, w, zw, wz, or zwz.



Proof. We will use the algebraic relations between these gates to arrange them
into easily parallelizable groups. In step 1, we move Hadamard gates to the right
through the other gates as shown in figure 13. This leaves a circuit of
controlled-nots, π-shifts, and w-gates, followed by a single layer of R's and identities.

In step 2, we arrange π-shifts and w-gates into three groups: a set of
π-shifts, a set of w's, and another set of π-shifts, with controlled-nots interspersed
throughout. We can do this since when these gates are applied to different pairs
of qubits, they commute (up to the creation of an additional controlled-not), and
when applied to the same pair, generate a finite group. Specifically, if we call
the 4 × 4 matrix of the π-shift z, then z and w obey the relations $z^2 = w^2 = 1$
and $wzw = zwz$, and generate the permutation group on three elements
$S_3 = \{1, z, w, zw, wz, zwz\}$. Thus a group of z's, a group of w's, and a group of z's
are sufficient. These relations are shown in figure 14.

Fig. 15. Step 3: commuting π-shifts and w's past controlled-nots.



In step 3, we pull the controlled-nots to the left through the z's and w's as
shown in figure 15. This makes some additional symmetric gates, but always
of the same type we pull through, so the grouping of z's, w's and z's is not
disturbed. (We also sometimes create single-qubit gates X and Z, but these can
be thought of as controlled-nots or π-shifts whose control qubit is always true.)

Finally, we note that since w is simply z in the Hadamard basis as shown in
figure 12, we can write the group of w's as a group of z's conjugated with R on
every qubit. We are left with a circuit of controlled-not gates, followed by three
groups of π-shifts separated by two layers of R's, and a single layer of possible
R's as shown schematically in figure 16.

Propositions 5 and 6 show how to parallelize circuits of π-shifts and of
controlled-nots to $O(\log n)$ depth with $O(n^2)$ ancillae, and the theorem is proved.

□
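
The algebraic relations used in this proof are easy to confirm numerically.
The sketch below is ours, using plain nested lists for matrices; it assumes
the first qubit of a two-qubit gate is the more significant index, so that
`kron(I2, R)` acts on the target and `kron(R, I2)` on the input of the
controlled-not.

```python
from math import sqrt


def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def kron(a, b):
    """Kronecker (tensor) product of two matrices."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def close(a, b, tol=1e-12):
    """Entrywise comparison up to rounding error."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


h = 1 / sqrt(2)
I2 = [[1, 0], [0, 1]]
R = [[h, h], [h, -h]]                    # Hadamard gate
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
z = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]  # pi-shift
I4 = kron(I2, I2)

# Conjugating the target of a controlled-not with R gives the pi-shift.
target_conj = matmul(kron(I2, R), matmul(CNOT, kron(I2, R)))
# Conjugating the input instead gives the w-gate.
w = matmul(kron(R, I2), matmul(CNOT, kron(R, I2)))

wzw = matmul(w, matmul(z, w))
zwz = matmul(z, matmul(w, z))
```

The relation wzw = zwz together with z² = w² = 1 is the usual presentation of
the permutation group on three elements, with z and w playing the role of two
transpositions.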

With a little extra work we should also be able to include the one-qubit π/2
shift $P = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}$. This would give us the Clifford
group, which is the normalizer of the group of tensor products of Pauli matrices.
In fact, some of the relations we have used here are equivalent to those used by
Gottesman to derive the Heisenberg representation of circuits in the Clifford group.

Fig. 16. The kind of circuit we are left with after steps 1, 2, and 3, and after
writing the w's as π-shifts conjugated by R.



There may be other interesting finite subgroups of $O(2^n)$ that we can
parallelize. However, if we add the two-qubit controlled-P gate (also known as
the 'square-root-of-not') we get universal computation, i.e. we can generate a
dense set of quantum operators. Algebraically, this shows up as the fact that
the controlled-Z gate is the only two-qubit phase shift whose conjugate by a
controlled-not can be expressed with two- and one-qubit gates, just as P and Z
are the only one-qubit phase shifts whose conjugate by a controlled-not can be
expressed with themselves and controlled-Z gates. Other phase shifts generate
three- and more-qubit interactions when they are commuted through
controlled-nots.

In any case, this gives us the following corollary. 

Corollary 1. Additive (or 'stabilizer') quantum error-correcting codes are in
$\mathrm{QNC}^1$, in the sense that encoding and decoding families of such codes with
n-qubit code words can be done in $O(\log n)$ depth and $O(n^2)$ ancillae.

Proof. Since the Pauli matrices $\sigma_x$ and $\sigma_z$ generate bit errors and phase errors
respectively, circuits for quantum codes such as those in [?] are composed
of controlled-Pauli and Hadamard gates. By a result of Rains [13], additive
quantum codes are always equivalent to real ones, so the real version of the
controlled-Y gate is sufficient. □

In fact, Cleve and Gottesman [?] and Steane [16] have shown that circuits for
additive quantum codes can be constructed out of controlled-Pauli gates, where
Hadamard gates appear only in one or two layers. Thus proposition 9 is already
enough to parallelize these circuits.



9 QNC < QP? The staircase circuit



A simple, perhaps minimal, example of a quantum circuit that seems hard to
parallelize is the "staircase" circuit shown in figure 17. This kind of structure
appears in the standard circuit for the quantum Fourier transform, which has
$O(n^2)$ gates [?, 14]. Careful inspection shows that the QFT can in fact be
parallelized to $O(n)$ depth as shown in figure 18 (an upside-down version of which
is given in [?]), but it seems difficult to do any better. Clearly, any fast parallel
circuit for the QFT would be relevant to prime factoring and other problems the
QFT is used for.

Fig. 17. These "staircase" circuits seem hard to parallelize unless the operators
are purely diagonal or off-diagonal.
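
The linear depth of the QFT circuit can be reproduced with a simple
scheduling sketch. This is our own illustration and assumes the textbook gate
order (a Hadamard on qubit j followed by controlled phase shifts coupling
qubit j to every later qubit, with the final reordering of qubits left out),
which may differ in detail from the circuit of figure 18. The hypothetical
helper `asap_depth` places each gate in the earliest layer in which all of its
qubits are free, without reordering gates.

```python
def asap_depth(gates):
    """Depth when each gate (a tuple of qubits) runs as soon as its qubits are free."""
    busy_until = {}
    depth = 0
    for qubits in gates:
        layer = 1 + max(busy_until.get(q, 0) for q in qubits)
        for q in qubits:
            busy_until[q] = layer
        depth = max(depth, layer)
    return depth


def qft_gates(n):
    """Gate list of the assumed textbook QFT circuit on n qubits."""
    gates = []
    for j in range(n):
        gates.append((j,))            # Hadamard on qubit j
        for k in range(j + 1, n):
            gates.append((j, k))      # controlled phase shift between j and k
    return gates


depths = [asap_depth(qft_gates(n)) for n in range(1, 9)]
```

Under these assumptions the Hadamard on qubit j lands in layer 2j + 1, so the
last one determines the depth.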

If we define QP as the family of quantum operators that can be expressed 
with circuits of polynomial depth (again, leaving measurement issues aside for 
now), we can make the following conjecture: 

Conjecture 1. Staircase circuits composed of controlled-U gates other than
diagonal or off-diagonal gates (i.e. other than the special cases handled in
propositions ?? and ??) cannot be parallelized to less than linear depth. Therefore,
QNC < QP.



10 Conclusion 

We conclude with some questions for further work.

Does parallelizing the encoding and decoding of error-correcting codes help
reduce the error threshold for reliable quantum computation, at least in regimes
where storage errors are more significant than gate errors?

Parsing classical context-free languages is in NC, and quantum context-free
languages have been defined in [10]. Is quantum parsing, i.e. producing derivation
trees with the appropriate amplitudes, in QNC?

Finally, can the reader show that the staircase circuit cannot be parallelized,
thus showing that QNC < QP? This would be quite significant, since the
corresponding classical question NC < P is still open, and believed to be very hard.

Fig. 18. The standard circuit for the quantum Fourier transform on n qubits
can be carried out in 2n − 1 layers. Can it be parallelized to less than linear
depth?